Skip to content

Conversation

@ldionne
Copy link
Member

@ldionne ldionne commented Nov 20, 2025

When closing a filebuf, we would previously call setbuf(0, 0). Not only does this get rid of any underlying buffer set by the user, but it also sets the unbuffered mode.

This means that if the filebuf is then reopened to keep writing to a file, it would have lost track of the original user-provided buffer (or the default one which has a size of 4096), and instead it would use the unbuffered mode, which is terribly slow.

This led to a bug report where users were complaining that closing and reopening the filebuf led to a significantly worse performance than using it without having closed and reopened. While this is a slightly unusual usage pattern, it should definitely work.

rdar://161833214

When closing a filebuf, we would previously call setbuf(0, 0). Not
only does this get rid of any underlying buffer set by the user, but
it also sets the unbuffered mode.

This means that if the filebuf is then reopened to keep writing to a
file, it would have lost track of the original user-provided buffer
(or the default one which has a size of 4096), and instead it would
use the unbuffered mode, which is terribly slow.

This led to a bug report where users were complaining that closing and
reopening the filebuf led to a significantly worse performance than
using it without having closed and reopened. While this is a slightly
unusual usage pattern, it should definitely work.

rdar://161833214
@ldionne ldionne requested a review from a team as a code owner November 20, 2025 21:14
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 20, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2025

@llvm/pr-subscribers-libcxx

Author: Louis Dionne (ldionne)

Changes

When closing a filebuf, we would previously call setbuf(0, 0). Not only does this get rid of any underlying buffer set by the user, but it also sets the unbuffered mode.

This means that if the filebuf is then reopened to keep writing to a file, it would have lost track of the original user-provided buffer (or the default one which has a size of 4096), and instead it would use the unbuffered mode, which is terribly slow.

This led to a bug report where users were complaining that closing and reopening the filebuf led to a significantly worse performance than using it without having closed and reopened. While this is a slightly unusual usage pattern, it should definitely work.

rdar://161833214


Full diff: https://github.com/llvm/llvm-project/pull/168947.diff

2 Files Affected:

  • (modified) libcxx/include/fstream (+6-1)
  • (added) libcxx/test/std/input.output/file.streams/fstreams/filebuf.members/close.dont-get-rid-of-buffer.pass.cpp (+132)
diff --git a/libcxx/include/fstream b/libcxx/include/fstream
index 90e35740c17cf..acce9a026abf1 100644
--- a/libcxx/include/fstream
+++ b/libcxx/include/fstream
@@ -778,7 +778,12 @@ basic_filebuf<_CharT, _Traits>* basic_filebuf<_CharT, _Traits>::close() {
     if (fclose(__h.release()))
       __rt = nullptr;
     __file_ = nullptr;
-    setbuf(0, 0);
+    // Reset the get and the put areas without nonetheless getting rid of the
+    // underlying buffers, which might have been configured by the user who might
+    // reopen the stream.
+    this->setg(nullptr, nullptr, nullptr);
+    this->setp(nullptr, nullptr);
+    __cm_ = __no_io_operations;
   }
   return __rt;
 }
diff --git a/libcxx/test/std/input.output/file.streams/fstreams/filebuf.members/close.dont-get-rid-of-buffer.pass.cpp b/libcxx/test/std/input.output/file.streams/fstreams/filebuf.members/close.dont-get-rid-of-buffer.pass.cpp
new file mode 100644
index 0000000000000..6c2dcefdaaa37
--- /dev/null
+++ b/libcxx/test/std/input.output/file.streams/fstreams/filebuf.members/close.dont-get-rid-of-buffer.pass.cpp
@@ -0,0 +1,132 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// This test requires the fix to std::filebuf::close() (which is defined in the
+// built library) from PR XXX.
+// UNSUPPORTED: using-built-library-before-llvm-22
+
+// <fstream>
+
+// basic_filebuf<charT,traits>* close();
+
+//
+// Ensure that basic_filebuf::close() does not get rid of the underlying buffer set
+// via pubsetbuf(). Otherwise, reopening the stream will result in not reusing the
+// same buffer, which might be conforming but is definitely surprising. The standard
+// is not very clear on whether that is actually conforming.
+//
+
+#include <cassert>
+#include <fstream>
+#include <string>
+
+#include "platform_support.h"
+#include "test_macros.h"
+
+struct overflow_detecting_filebuf : std::filebuf {
+  explicit overflow_detecting_filebuf(bool* overflow_monitor) : did_overflow_(overflow_monitor) {
+    assert(overflow_monitor != nullptr && "must provide an overflow monitor");
+  }
+
+  using Traits = std::filebuf::traits_type;
+  virtual std::filebuf::int_type overflow(std::filebuf::int_type ch = Traits::eof()) {
+    *did_overflow_ = true;
+    return std::filebuf::overflow(ch);
+  }
+
+private:
+  bool* did_overflow_;
+};
+
+int main(int, char**) {
+  {
+    std::string temp = get_temp_file_name();
+
+    bool did_overflow;
+    overflow_detecting_filebuf buf(&did_overflow);
+
+    // Set a custom buffer (of size 32, reused below)
+    char underlying_buffer[32];
+    buf.pubsetbuf(underlying_buffer, sizeof(underlying_buffer));
+
+    // (1) Open a file and insert a first character. That should overflow() and set the underlying
+    //     put area to our internal buffer set above.
+    {
+      buf.open(temp, std::ios::out | std::ios::trunc);
+      did_overflow = false;
+      buf.sputc('c');
+      assert(did_overflow == true);
+    }
+
+    // (2) Now, confirm that we can still insert 30 more characters without calling
+    //     overflow, since we should be writing to the internal buffer.
+    {
+      did_overflow = false;
+      for (int i = 0; i != 30; ++i) {
+        buf.sputc('c');
+        assert(did_overflow == false);
+      }
+    }
+
+    // (3) Writing the last character may or may not call overflow(), depending on whether
+    //     the library implementation wants to flush as soon as the underlying buffer is
+    //     full, or on the next attempt to insert. For libc++, it doesn't overflow yet.
+    {
+      did_overflow = false;
+      buf.sputc('c');
+      LIBCPP_ASSERT(!did_overflow);
+    }
+
+    // (4) Writing one-too-many characters will overflow (with libc++).
+    {
+      did_overflow = false;
+      buf.sputc('c');
+      LIBCPP_ASSERT(did_overflow);
+    }
+
+    // Close the stream. This should NOT unset the underlying buffer we set at the beginning
+    // Unfortunately, the only way to check that is to repeat the above tests which tries to
+    // tie the presence of our custom set buffer to whether overflow() gets called. This is
+    // not entirely portable since implementations are free to call overflow() whenever they
+    // want, but in practice this works pretty portably.
+    buf.close();
+
+    // Repeat (1)
+    {
+      buf.open(temp, std::ios::out | std::ios::trunc);
+      did_overflow = false;
+      buf.sputc('c');
+      assert(did_overflow == true);
+    }
+
+    // Repeat (2)
+    {
+      did_overflow = false;
+      for (int i = 0; i != 30; ++i) {
+        buf.sputc('c');
+        assert(did_overflow == false);
+      }
+    }
+
+    // Repeat (3)
+    {
+      did_overflow = false;
+      buf.sputc('c');
+      LIBCPP_ASSERT(!did_overflow);
+    }
+
+    // Repeat (4)
+    {
+      did_overflow = false;
+      buf.sputc('c');
+      LIBCPP_ASSERT(did_overflow);
+    }
+  }
+
+  return 0;
+}

Comment on lines +781 to +783
// Reset the get and the put areas without nonetheless getting rid of the
// underlying buffers, which might have been configured by the user who might
// reopen the stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Reset the get and the put areas without nonetheless getting rid of the
// underlying buffers, which might have been configured by the user who might
// reopen the stream.
// Reset the get and the put areas without getting rid of the
// underlying buffers, which might have been configured by the user.
// Make sure to keep the buffer, since the user may re-open the stream.

I think is what you wanted to say?

Comment on lines +20 to +21
// same buffer, which might be conforming but is definitely surprising. The standard
// is not very clear on whether that is actually conforming.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the standard is quite clear. setbuf has implementation-defined behaviour except for setbuf(0, 0) (https://eel.is/c++draft/input.output#filebuf.virtuals-12), so we can do whatever we want in terms of buffering and when we throw away buffers. Speaking of, this seems to want an entry in ImplementationDefinedBehavior.rst.

};

int main(int, char**) {
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the extra scope?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants